Exploiting Structure for Intelligent Web Search
Abstract
As the amount of online data grows rapidly, there is a pressing need for intelligent search engines that operate over a restricted body of data, such as that found in intranets or other limited domains. Search engines of this kind must go beyond simple keyword indexing and matching, yet they must also be easy to adapt to new domains without great cost. This paper presents a mechanism that addresses both points. First, the internal document structure is used to extract concepts, which impose a directory-like structure on the documents similar to that found in classified directories. Second, this is done efficiently, in a way that is largely language independent and makes no assumptions about the document structure.
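The idea of deriving concepts from internal document structure can be illustrated with a minimal sketch. It is not the paper's implementation, and everything specific in it is an assumption: the documents are taken to be HTML, the tag-to-context table `CONTEXT_TAGS` is a hypothetical choice, and a term counts as a concept only if it occurs in at least `min_contexts` different markup contexts. The names `ContextCollector`, `extract_concepts`, and `build_directory` are illustrative.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re

# Hypothetical mapping from HTML tags to markup contexts (an assumption,
# not taken from the paper).
CONTEXT_TAGS = {
    "title": "title",
    "h1": "heading", "h2": "heading", "h3": "heading",
    "b": "emphasis", "strong": "emphasis", "em": "emphasis",
    "a": "anchor",
}


class ContextCollector(HTMLParser):
    """Record, for each term, the set of markup contexts it occurs in."""

    def __init__(self):
        super().__init__()
        self.open_contexts = []                  # contexts currently open
        self.contexts_by_term = defaultdict(set)

    def handle_starttag(self, tag, attrs):
        if tag in CONTEXT_TAGS:
            self.open_contexts.append(CONTEXT_TAGS[tag])

    def handle_endtag(self, tag):
        context = CONTEXT_TAGS.get(tag)
        if context is not None and context in self.open_contexts:
            # Close the most recently opened matching context.
            last = len(self.open_contexts) - 1 - self.open_contexts[::-1].index(context)
            del self.open_contexts[last]

    def handle_data(self, data):
        # \w+ is Unicode-aware, so tokenization is not tied to English.
        for term in re.findall(r"\w+", data.lower()):
            for context in self.open_contexts:
                self.contexts_by_term[term].add(context)


def extract_concepts(html, min_contexts=2):
    """Return terms that occur in at least `min_contexts` markup contexts."""
    collector = ContextCollector()
    collector.feed(html)
    collector.close()
    return {
        term
        for term, contexts in collector.contexts_by_term.items()
        if len(contexts) >= min_contexts
    }


def build_directory(pages):
    """Map each concept to the set of page names it was extracted from."""
    directory = defaultdict(set)
    for name, html in pages.items():
        for concept in extract_concepts(html):
            directory[concept].add(name)
    return directory
```

Calling `build_directory` on a dictionary that maps page names to HTML strings yields a concept-to-pages index that can be browsed like a small classified directory. Because the sketch relies only on markup and Unicode word boundaries, it needs no language-specific resources, although the tag table itself would have to change for other document formats.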
Related resources
Developing Intelligent Search Engines
Developers of search engines today face more than technical problems such as designing an efficient crawler or distributing search requests among servers. Search has become a problem of identifying reliable information in an adversarial environment. Since the web is used for purposes as diverse as trade, communication, and advertisement, search engines need to be able to distinguish different ...
Modeling and Exploiting User Search Behavior for Information Retrieval
The ongoing explosion of web information calls for more intelligent and personalized methods towards better search result quality for advanced queries. Query logs and click streams obtained from web browsers or search engines can contribute to better quality by exploiting the implicitly embedded preferences and aggregated recommendations. This work addresses approaches for incorporating implici...
A New Hybrid Method for Web Pages Ranking in Search Engines
There are many algorithms for optimizing search engine results. Ranking takes place according to one or more parameters, such as backward links, forward links, content, and click-through rate. The quality and performance of these algorithms depend on the listed parameters. Ranking is one of the most important components of the search engine that represents the degree of the vitality...
Exploiting Personal Search History to Improve Search Accuracy
Personal search history is an important type of personal information, from which we can learn a user’s interests and information needs, and thus improve the search service for the user. In this paper, we describe our recent work on User-Centered Adaptive Information Retrieval (UCAIR), which aims at capturing personal search history with a client-side search agent and exploiting the history informat...
Intelligent Web Crawling
Web crawling, the process of collecting web pages in an automated manner, is the primary and ubiquitous operation used by a large number of web systems and agents, from a simple program for website backup to a major web search engine. Due to the astronomical amount of data already published on the Web and the ongoing exponential growth of web content, any party that wants to take advantage of m...